My initial corpus consisted of my Top Tracks of 2019 and 2020 and the Top Tracks NL of 2019 and 2020. I wanted to analyze how ‘average’ my taste in music was, and how it changed during the COVID-19 pandemic. Due to the social distancing measures I did not listen to music in any social setting, such as hanging out with friends, clubbing, or even working out at the gym.
After seven weeks of homework assignments, disappointing results and hence disappointing data visualizations, I decided to change my corpus. The Top Tracks NL playlists were too ‘general’ to really capture any meaning or essence of ‘the average Dutch taste in music’.
My new corpus consists of 510 tracks from 7 playlists in total. I still use my Top Tracks of 2019 and 2020, but I have added my own 2021 playlist. This way, 2019 represents the pre-pandemic situation and 2021 represents the one-year mark. The other four playlists are playlists of the NL top tracks of 2020, one for each of the following genres: pop, dance, hip-hop/R&B and indie/rock. These genre playlists are shown below.
I want to analyze the following about this new corpus:
1. In terms of genre, how can my taste in music best be described?
2. How has my taste in music evolved during the COVID-19 pandemic?
It is important to note that the playlists do not contain the same number of tracks. The four genre playlists each contain 50 tracks. My 2019 and 2020 playlists each contain 100 tracks and my 2021 playlist contains 110 tracks. I decided not to take out the extra 10, because there was no satisfactory objective way of doing so.
It is also important to note that my 2021 playlist is the only playlist I have compiled myself; the others were compiled by Spotify. It might be the case that Spotify biases the playlists by including certain (possibly sponsored) songs. Although I do not consider this plausible, it is important to keep in mind.
More plausible is the possibility that my personal Top Tracks playlists are biased, albeit for different reasons. First, I do not have a premium Spotify account, which means I get a limited number of skips per hour and most playlists can only be played on shuffle. It might be that a song ended up a Top Track because it was in a playlist I listened to a lot, not because I liked that song so much. This limitation, however, only applies when listening to Spotify on my phone. The desktop version of Spotify does allow for infinite skips and the freedom to choose songs manually, put songs in the queue, and play a playlist on shuffle or in order.
A great example of a dubious song is “Stuck with U” by Ariana Grande and Justin Bieber. It is one of my Top Tracks of 2020. It used to be in a lot of different playlists I listened to at the time. And yes, I liked that song, but nevertheless, I am fairly sure there were other songs I liked more in 2020. I definitely consider this song atypical for my taste in music in 2020.
In contrast, a song I consider very typical for my taste is “Drive and Disconnect” by Nao. I remember listening to this song on repeat when I discovered it in early 2019, and for a long time after that. A few months later I rediscovered this song and fell in love all over again. Even now, it is still one of my favorite songs.
Scatter plot of energy versus valence with mean and SD per playlist
The means are represented by the bigger dots, the SDs by the line segments.
Overall, I am really not surprised by the results of this energy-valence scatter plot. It is pretty much what I expected about these features. At first, I was surprised that my playlists are so diverse. Later, I realized that I indeed listen to all kinds of emotion in music.
My personal playlists seem to resemble each other the most. Their means, as well as their scattering, are most similar to the pop genre. The shape of the 2021 scattering, however, seems to have slightly shifted toward hip-hop/R&B. I already suspected it would, because I have been discovering some nice new R&B artists lately.
The genres dance and indie/rock look least like my playlists, which also makes sense, since I do not listen to those genres much. It is very interesting to see that indie/rock is so all over the place. I am not a real indie expert, so I would not know if this was to be expected. I did expect rock to have some tracks in the upper-left corner, which it does indeed. What is so interesting about dance is that the tracks lie very close to each other in the spectrum. With electronic instrumentation, it makes sense that the energy of dance tracks is overall high.
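The summary markers in the energy-valence plot (bigger dots for the means, line segments for the SDs) could be computed along the following lines. This is a minimal sketch, not the code behind the actual plot: the playlist names and feature values are hypothetical, and it assumes the sample standard deviation, which the text does not specify.

```python
from statistics import mean, stdev

# Hypothetical (energy, valence) pairs; NOT values from the actual corpus.
tracks = {
    "My 2019": [(0.62, 0.55), (0.70, 0.48), (0.51, 0.61)],
    "NL dance 2020": [(0.81, 0.52), (0.77, 0.60), (0.85, 0.47)],
}


def summarize(points):
    """Return ((mean_energy, mean_valence), (sd_energy, sd_valence))."""
    energy = [e for e, _ in points]
    valence = [v for _, v in points]
    # stdev() is the sample SD (n - 1 in the denominator); this is an assumption.
    return (mean(energy), mean(valence)), (stdev(energy), stdev(valence))


summary = {name: summarize(points) for name, points in tracks.items()}
for name, (centre, spread) in summary.items():
    print(f"{name}: mean=({centre[0]:.2f}, {centre[1]:.2f}) "
          f"sd=({spread[0]:.2f}, {spread[1]:.2f})")
```

Each mean gives the position of a big dot, and each SD pair gives the half-lengths of the horizontal and vertical line segments through it.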
Scatter plot of danceability versus tempo with mean and SD per playlist
The means are represented by the bigger dots, the SDs by the line segments.
What strikes me the most is that the tracks in my personal playlists all cluster around 100 bpm. When I saw this visualization, I immediately wanted to check it. And indeed, at 100 bpm, my body seems to start moving to the music effortlessly. So unlike what Moelants (2002) showed, my preferred tempo does lie at 100 bpm.
Just like my playlists, the genres hip-hop/R&B and pop also seem to have a cluster of tracks around 100 bpm, although less obvious. Notice that all of their means are slightly higher, since they spread out more to the right. Indie/rock is clustered more around 120 bpm, and dance even more so.
The danceability of the tracks in my playlists mostly lies between 0.6 and 0.9, with means of about 0.7. Dance tracks are spread out from about 0.5 to 0.9, also with a mean of 0.7. Not surprisingly, pop and indie/rock tracks have a somewhat lower danceability. Hip-hop/R&B tracks, however, have a slightly higher danceability.
In conclusion, regarding tempo and danceability, my taste in music can be characterized by the genres pop and hip-hop/R&B. Also, I have gotten very curious what exactly determines danceability for the Spotify API, since it does not necessarily take tempo into account.
It is immediately apparent that the indie/rock and pop playlists contain relatively more major than minor tracks, as opposed to the other playlists. Also, my 2021 tracks are more likely to be in minor mode than my Top Tracks of 2019 or 2020. As in the energy-valence visualization, this might indicate that my taste in music has slightly shifted towards R&B. As opposed to previous results, this barplot shows that, regarding mode, my taste in music can be least characterized by pop.
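Because the playlists differ in size (50, 100 or 110 tracks), the comparison of modes only makes sense in terms of proportions rather than raw counts. The sketch below shows one way to compute the share of major-mode tracks per playlist. The mode lists are hypothetical, and the only thing taken from Spotify is its encoding of `mode` (1 for major, 0 for minor).

```python
from collections import Counter

# Hypothetical mode values per playlist; NOT the actual corpus.
# Spotify encodes major as 1 and minor as 0.
modes = {
    "My 2021": [0, 0, 1, 0, 1],
    "NL pop 2020": [1, 1, 1, 0, 1],
}


def major_share(mode_values):
    """Fraction of tracks in major mode within one playlist."""
    counts = Counter(mode_values)
    return counts[1] / len(mode_values)


shares = {name: major_share(values) for name, values in modes.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%} major, {1 - share:.0%} minor")
```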
This song was one of my favorite tracks of 2020 and a pretty typical one as well (see song info). In the bars chromagram, you can clearly see the repetitive chord progression throughout the song, e.g. the skips from C to G. The sections chromagram clearly shows bright ‘blocks’ at D and C. The D-blocks represent the chorus vocals and the C-block represents the vocals in the bridge. The first few blocks at G and F represent the instrumental intro.
I think the overall look of these chromagrams is very representative of the tracks I generally listen to. Speaking from a music-theoretic point of view, most songs have a clear verse-chorus structure (as shown by the sections chromagram) and simple melodies and harmonies (as shown by the bars chromagram). However, the verse-chorus structure of this song is quite different from the usual “verse-chorus-verse-chorus-bridge-chorus” of pop songs (see song structure).
| Feature | Average in 2020 | “Sweetie Odo” |
|---|---|---|
| Key | F# major | G minor |
| Time Signature | 4 | 4 |
| BPM | 111.84 | 100 |
| Energy | 0.64 | 0.49 |
| Valence | 0.53 | 0.53 |
| Danceability | 0.7 | 0.84 |
| Time (s) | Section | Instrumentation |
|---|---|---|
| 0-9 | intro | guitar |
| 10-19 | intro | + percussion |
| 19-38 | intro | + voice |
| 38-57 | chorus | |
| 57-76 | post-chorus | |
| 76-96 | verse | |
| 96-114 | bridge | |
| 114-134 | chorus | |
| 134-155 | outro / post-chorus | - guitar |
This song was one of my favorite tracks of 2020 and a pretty typical one as well (see song info). Both cepstrograms clearly show the intro and outro in the third timbre component. The verse is also brighter in c03. The chorus sections, however, light up in c05. I wonder what this component represents. The second timbre component most clearly shows the bridge and, to a lesser extent, the choruses.
What is most representative about this song, and clearly visualized in these cepstrograms, is that most songs I listen to have an intro and outro. Also, I believe I do not listen to many songs where the loudness – the first timbre component – changes a lot.
| Feature | Average in 2020 | “Sweetie Odo” |
|---|---|---|
| Time Signature | 4 | 4 |
| BPM | 111.84 | 100 |
| Energy | 0.64 | 0.49 |
| Valence | 0.53 | 0.53 |
| Danceability | 0.7 | 0.84 |
| Time (s) | Section | Instrumentation |
|---|---|---|
| 0-9 | intro | guitar |
| 10-19 | intro | + percussion |
| 19-38 | intro | + voice |
| 38-57 | chorus | |
| 57-76 | post-chorus | |
| 76-96 | verse | |
| 96-114 | bridge | |
| 114-134 | chorus | |
| 134-155 | outro / post-chorus | - guitar |
Because this song produced remarkably nice self-similarity matrices, I decided to analyze it once more, but now focusing on homogeneity rather than the specific chroma and timbre features.
Both matrices clearly show (parts of) the song’s structure (see below). The chroma matrix clearly shows the instrumental intro (first strip), the vocal intro, verse and outro (light strips), and the choruses with bridge (dark squares). It makes sense that both choruses are similar in pitch. Interestingly, the bridge resembles them. I could not figure out why, not by ear at least. Also interesting to see is that the second post-chorus is not similar to the first post-chorus. When you listen to the song, however, it does make a lot of sense; in the outro, the guitar stops playing.
The timbre matrix also clearly shows the outro, for the same reason actually. Timbre is partly determined by instrumentation, so it makes sense that the outro (without guitar) resembles no other part of the song (with guitar). Also notable is the dark square at the bottom-left corner of the timbre matrix. This represents the percussion coming in. In general, all squares in the timbre matrix can be ascribed to the homogeneity of the percussion. A new square indicates a change or short pause in the percussion – even the slight changes just before 60 and 120 seconds represent a short percussion break of about a bar.
Also interesting, the vocal intro and the verse do not seem similar in pitch, yet they have the same repetitions (riffs probably). They are, overall, more similar in timbre. The bridge and choruses seem less similar in timbre than in pitch.
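To make the reading of these matrices concrete: a self-similarity matrix compares every segment of the song with every other segment, so a homogeneous section shows up as a square of low distances along the diagonal. The sketch below builds such a matrix from a handful of hypothetical feature vectors. Cosine distance is an assumption here; the text does not state which distance or normalization was used for the actual matrices.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def self_similarity(frames):
    """Pairwise distance matrix between all segments of one track."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Hypothetical 3-dimensional feature vectors (chroma or timbre) for five
# consecutive segments; segments 0, 1 and 4 are alike, as are 2 and 3.
frames = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.3],
    [0.1, 0.9, 0.2],
    [1.0, 0.1, 0.0],
]
ssm = self_similarity(frames)
for row in ssm:
    print(" ".join(f"{d:.2f}" for d in row))
```

In this toy example, the small distances between segments 0 and 1 and between segments 2 and 3 form two squares on the diagonal, while the low value in the corner (segment 0 versus segment 4) corresponds to a section that returns later in the song.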
| Time (s) | Section | Instrumentation |
|---|---|---|
| 0-9 | intro | guitar |
| 10-19 | intro | + percussion |
| 19-38 | intro | + voice |
| 38-57 | chorus | |
| 57-76 | post-chorus | |
| 76-96 | verse | |
| 96-114 | bridge | |
| 114-134 | chorus | |
| 134-155 | outro / post-chorus | - guitar |
| class | precision | recall |
|---|---|---|
| 2019 | 0.4036697 | 0.44 |
| 2020 | 0.3033708 | 0.27 |
| 2021 | 0.3928571 | 0.40 |
On the left, you see a confusion matrix of a \(k\)-nearest neighbor classifier trying to classify my three personal playlists. It performs moderately on tracks from 2019 and 2021 (precision and recall of roughly 0.40 to 0.44), yet poorly on tracks from 2020 (precision 0.30, recall 0.27; see the table above). This could mean that my taste in music has shifted over the course of these three years and that tracks from 2020 fall right in the middle. As a result, the classifier can effectively only distinguish two classes and does not know what to do with 2020 tracks. This ‘shift’ from 2019 to 2021 has not been as clear in previous visualizations, where usually only 2021 showed a small difference from the other two.
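For readers unfamiliar with the two measures: precision is the share of tracks predicted as a class that really belong to it, and recall is the share of tracks in a class that were predicted correctly. The sketch below reproduces the table values from counts. Note that these counts are inferred, not reported: they are the whole numbers that match the published precision and recall given the playlist sizes (100, 100 and 110), under the assumption that all 310 tracks were classified.

```python
# Counts inferred from the reported precision/recall and the playlist sizes;
# they are an assumption, not figures stated in the text.
counts = {
    # class: (correctly classified, predicted as this class, actually in class)
    "2019": (44, 109, 100),
    "2020": (27, 89, 100),
    "2021": (44, 112, 110),
}


def precision_recall(correct, predicted, actual):
    """Precision = correct / predicted, recall = correct / actual."""
    return correct / predicted, correct / actual


metrics = {label: precision_recall(*c) for label, c in counts.items()}
for label, (precision, recall) in metrics.items():
    print(f"{label}: precision={precision:.7f} recall={recall:.2f}")
```

Under these inferred counts, more tracks were predicted as 2019 (109) and 2021 (112) than as 2020 (89), which fits the observation that the classifier struggles most with the 2020 class.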
The \(k\)-nearest neighbor classifier uses \(k=1\). I have tried several values of \(k\), but none produced clearly better results. The random forest classifier performed clearly worse than this classifier. It could not distinguish the 2019 and 2020 classes at all: for both, the predictions were spread at about 33 tracks per class. It did, however, perform better on the 2021 class. Maybe the random forest classifier was more heavily influenced by my 2021 playlist containing 110 tracks, as opposed to the 100 tracks in my 2019 and 2020 playlists. This makes sense when comparing it to the \(k\)-nearest neighbor classifier, because that one only considers the \(k\) nearest neighbors and does not take ‘the whole’ into account.
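To illustrate why a \(k=1\) classifier only looks locally: it assigns a track the label of the single closest track in feature space and ignores everything else, including how many tracks each class contains. The sketch below is a hypothetical illustration and not the classifier used for the results above. The features, their values and the use of Euclidean distance are all assumptions.

```python
import math


def nearest_neighbor_label(query, training):
    """Return the label of the training point closest to `query` (k = 1)."""
    _, label = min(training, key=lambda item: math.dist(item[0], query))
    return label


# Hypothetical (energy, valence) features with playlist labels;
# NOT values from the actual corpus.
training = [
    ((0.60, 0.55), "2019"),
    ((0.65, 0.50), "2020"),
    ((0.45, 0.40), "2021"),
]

prediction = nearest_neighbor_label((0.47, 0.42), training)
print(prediction)
```

Because only the closest track decides the outcome, two playlists whose tracks overlap heavily in feature space will be confused with each other regardless of their sizes.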
In most visualizations we saw a clear similarity between my personal playlists and the genres pop and R&B. My 2021 playlist usually shifted a little more towards R&B.
As the results have shown, I started listening to more R&B songs in 2021. Tracks from 2021 were also more often in minor than tracks from 2019 and 2020. According to the \(k\)-nearest neighbor classifier, my music taste has indeed shifted since 2019: tracks from 2020 were hard to classify, probably because 2019 and 2021 were so clearly different from each other.

### Outlook

This corpus research has been for personal interest only. It might be nice to analyze music trends over the COVID-19 pandemic in general, but my corpus would in no way suffice for that analysis.
Recently, I have started listening to more Dutch music. I wonder whether, in the future, I can research such ‘language trends’ as well. I have also started wondering whether I have seasonal preferences. I could imagine the following seasonal pattern:
| Time of the year | Winter | Summer |
|---|---|---|
| Energy-valence | relaxed | upbeat |
| Genre | R&B-ish | Latin and pop |
| Mode | minor | major |